Overview

Dataset statistics

Number of variables: 21
Number of observations: 2000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 328.2 KiB
Average record size in memory: 168.1 B

Variable types

Numeric: 14
Categorical: 7

Warnings

fc is highly correlated with pc (High correlation)
four_g is highly correlated with three_g (High correlation)
pc is highly correlated with fc (High correlation)
px_height is highly correlated with px_width (High correlation)
px_width is highly correlated with px_height (High correlation)
ram is highly correlated with price_range (High correlation)
sc_h is highly correlated with sc_w (High correlation)
sc_w is highly correlated with sc_h (High correlation)
three_g is highly correlated with four_g (High correlation)
price_range is highly correlated with ram (High correlation)
price_range is uniformly distributed (Uniform)
fc has 474 (23.7%) zeros (Zeros)
pc has 101 (5.1%) zeros (Zeros)
sc_w has 180 (9.0%) zeros (Zeros)

Reproduction

Analysis started: 2021-06-26 16:57:35.030838
Analysis finished: 2021-06-26 16:58:17.957120
Duration: 42.93 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json

Variables

battery_power
Real number (ℝ≥0)

Distinct: 1094
Distinct (%): 54.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1238.5185
Minimum: 501
Maximum: 1998
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 501
5-th percentile: 570.95
Q1: 851.75
Median: 1226
Q3: 1615.25
95-th percentile: 1930.15
Maximum: 1998
Range: 1497
Interquartile range (IQR): 763.5

Descriptive statistics

Standard deviation: 439.4182061
Coefficient of variation (CV): 0.3547934133
Kurtosis: -1.224143883
Mean: 1238.5185
Median Absolute Deviation (MAD): 382
Skewness: 0.03189847179
Sum: 2477037
Variance: 193088.3598
Monotonicity: Not monotonic
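
Several of the figures reported for battery_power follow from the others by simple arithmetic. The sketch below recomputes a few of them from the values listed above; the variable names are illustrative only, and this is not a description of how pandas-profiling computes them internally.

```python
# Values copied from the battery_power summary in this report.
minimum, maximum = 501, 1998
q1, q3 = 851.75, 1615.25
mean = 1238.5185
std = 439.4182061

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance

print(value_range, iqr, round(cv, 6), round(variance, 2))
```

The same relations can be used as a quick consistency check on any of the numeric variables in this report.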
Histogram with fixed size bins (bins=50)
Value, Count, Frequency (%)
1589, 6, 0.3%
618, 6, 0.3%
1872, 6, 0.3%
1379, 5, 0.2%
1310, 5, 0.2%
1063, 5, 0.2%
832, 5, 0.2%
1414, 5, 0.2%
1413, 5, 0.2%
1807, 5, 0.2%
Other values (1084), 1947, 97.4%

Value, Count, Frequency (%)
501, 2, 0.1%
502, 2, 0.1%
503, 3, 0.1%
504, 5, 0.2%
506, 1, 0.1%
507, 2, 0.1%
508, 3, 0.1%
509, 1, 0.1%
510, 3, 0.1%
511, 4, 0.2%

Value, Count, Frequency (%)
1998, 1, 0.1%
1997, 1, 0.1%
1996, 2, 0.1%
1995, 2, 0.1%
1994, 3, 0.1%
1993, 1, 0.1%
1992, 2, 0.1%
1991, 4, 0.2%
1989, 2, 0.1%
1988, 1, 0.1%

blue
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value, Count, Frequency (%)
0, 1010, 50.5%
1, 990, 49.5%

Length

Histogram of lengths of the category

Pie chart

Value, Count, Frequency (%)
0, 1010, 50.5%
1, 990, 49.5%

Most occurring characters

Value, Count, Frequency (%)
0, 1010, 50.5%
1, 990, 49.5%

Most occurring categories

Value, Count, Frequency (%)
Decimal Number, 2000, 100.0%

Most frequent character per category

Decimal Number
Value, Count, Frequency (%)
0, 1010, 50.5%
1, 990, 49.5%

Most occurring scripts

Value, Count, Frequency (%)
Common, 2000, 100.0%

Most frequent character per script

Common
Value, Count, Frequency (%)
0, 1010, 50.5%
1, 990, 49.5%

Most occurring blocks

Value, Count, Frequency (%)
ASCII, 2000, 100.0%

Most frequent character per block

ASCII
Value, Count, Frequency (%)
0, 1010, 50.5%
1, 990, 49.5%

clock_speed
Real number (ℝ≥0)

Distinct: 26
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.52225
Minimum: 0.5
Maximum: 3
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 0.5
5-th percentile: 0.5
Q1: 0.7
Median: 1.5
Q3: 2.2
95-th percentile: 2.8
Maximum: 3
Range: 2.5
Interquartile range (IQR): 1.5

Descriptive statistics

Standard deviation: 0.8160042089
Coefficient of variation (CV): 0.5360513772
Kurtosis: -1.323417222
Mean: 1.52225
Median Absolute Deviation (MAD): 0.8
Skewness: 0.1780841203
Sum: 3044.5
Variance: 0.6658628689
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=26)
Value, Count, Frequency (%)
0.5, 413, 20.6%
2.8, 85, 4.2%
2.3, 78, 3.9%
1.6, 76, 3.8%
2.1, 76, 3.8%
2.5, 74, 3.7%
0.6, 74, 3.7%
1.4, 70, 3.5%
1.3, 68, 3.4%
2, 67, 3.4%
Other values (16), 919, 46.0%

Value, Count, Frequency (%)
0.5, 413, 20.6%
0.6, 74, 3.7%
0.7, 64, 3.2%
0.8, 58, 2.9%
0.9, 58, 2.9%
1, 61, 3.0%
1.1, 51, 2.5%
1.2, 56, 2.8%
1.3, 68, 3.4%
1.4, 70, 3.5%

Value, Count, Frequency (%)
3, 28, 1.4%
2.9, 62, 3.1%
2.8, 85, 4.2%
2.7, 55, 2.8%
2.6, 55, 2.8%
2.5, 74, 3.7%
2.4, 58, 2.9%
2.3, 78, 3.9%
2.2, 59, 2.9%
2.1, 76, 3.8%

dual_sim
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value, Count, Frequency (%)
1, 1019, 50.9%
0, 981, 49.0%

Length

Histogram of lengths of the category

Pie chart

Value, Count, Frequency (%)
1, 1019, 50.9%
0, 981, 49.0%

Most occurring characters

Value, Count, Frequency (%)
1, 1019, 50.9%
0, 981, 49.0%

Most occurring categories

Value, Count, Frequency (%)
Decimal Number, 2000, 100.0%

Most frequent character per category

Decimal Number
Value, Count, Frequency (%)
1, 1019, 50.9%
0, 981, 49.0%

Most occurring scripts

Value, Count, Frequency (%)
Common, 2000, 100.0%

Most frequent character per script

Common
Value, Count, Frequency (%)
1, 1019, 50.9%
0, 981, 49.0%

Most occurring blocks

Value, Count, Frequency (%)
ASCII, 2000, 100.0%

Most frequent character per block

ASCII
Value, Count, Frequency (%)
1, 1019, 50.9%
0, 981, 49.0%

fc
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 20
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.3095
Minimum: 0
Maximum: 19
Zeros: 474
Zeros (%): 23.7%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 3
Q3: 7
95-th percentile: 13
Maximum: 19
Range: 19
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.341443748
Coefficient of variation (CV): 1.007412402
Kurtosis: 0.2770763246
Mean: 4.3095
Median Absolute Deviation (MAD): 3
Skewness: 1.019811411
Sum: 8619
Variance: 18.84813382
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)
Value, Count, Frequency (%)
0, 474, 23.7%
1, 245, 12.2%
2, 189, 9.4%
3, 170, 8.5%
5, 139, 7.0%
4, 133, 6.7%
6, 112, 5.6%
7, 100, 5.0%
9, 78, 3.9%
8, 77, 3.9%
Other values (10), 283, 14.1%

Value, Count, Frequency (%)
0, 474, 23.7%
1, 245, 12.2%
2, 189, 9.4%
3, 170, 8.5%
4, 133, 6.7%
5, 139, 7.0%
6, 112, 5.6%
7, 100, 5.0%
8, 77, 3.9%
9, 78, 3.9%

Value, Count, Frequency (%)
19, 1, 0.1%
18, 11, 0.5%
17, 6, 0.3%
16, 24, 1.2%
15, 23, 1.1%
14, 20, 1.0%
13, 40, 2.0%
12, 45, 2.2%
11, 51, 2.5%
10, 62, 3.1%

four_g
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 0
5th row: 1

Common Values

Value, Count, Frequency (%)
1, 1043, 52.1%
0, 957, 47.9%

Length

Histogram of lengths of the category

Pie chart

Value, Count, Frequency (%)
1, 1043, 52.1%
0, 957, 47.9%

Most occurring characters

Value, Count, Frequency (%)
1, 1043, 52.1%
0, 957, 47.9%

Most occurring categories

Value, Count, Frequency (%)
Decimal Number, 2000, 100.0%

Most frequent character per category

Decimal Number
Value, Count, Frequency (%)
1, 1043, 52.1%
0, 957, 47.9%

Most occurring scripts

Value, Count, Frequency (%)
Common, 2000, 100.0%

Most frequent character per script

Common
Value, Count, Frequency (%)
1, 1043, 52.1%
0, 957, 47.9%

Most occurring blocks

Value, Count, Frequency (%)
ASCII, 2000, 100.0%

Most frequent character per block

ASCII
Value, Count, Frequency (%)
1, 1043, 52.1%
0, 957, 47.9%

int_memory
Real number (ℝ≥0)

Distinct: 63
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.0465
Minimum: 2
Maximum: 64
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 2
5-th percentile: 5
Q1: 16
Median: 32
Q3: 48
95-th percentile: 61
Maximum: 64
Range: 62
Interquartile range (IQR): 32

Descriptive statistics

Standard deviation: 18.14571496
Coefficient of variation (CV): 0.5662307882
Kurtosis: -1.21607403
Mean: 32.0465
Median Absolute Deviation (MAD): 16
Skewness: 0.05788932785
Sum: 64093
Variance: 329.2669712
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value, Count, Frequency (%)
27, 47, 2.4%
14, 45, 2.2%
16, 45, 2.2%
2, 42, 2.1%
57, 42, 2.1%
7, 40, 2.0%
42, 40, 2.0%
44, 39, 1.9%
30, 39, 1.9%
6, 37, 1.8%
Other values (53), 1584, 79.2%

Value, Count, Frequency (%)
2, 42, 2.1%
3, 25, 1.2%
4, 20, 1.0%
5, 36, 1.8%
6, 37, 1.8%
7, 40, 2.0%
8, 37, 1.8%
9, 35, 1.8%
10, 36, 1.8%
11, 34, 1.7%

Value, Count, Frequency (%)
64, 31, 1.6%
63, 30, 1.5%
62, 21, 1.1%
61, 27, 1.4%
60, 27, 1.4%
59, 18, 0.9%
58, 36, 1.8%
57, 42, 2.1%
56, 27, 1.4%
55, 29, 1.5%

m_dep
Real number (ℝ≥0)

Distinct: 10
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.50175
Minimum: 0.1
Maximum: 1
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 0.1
5-th percentile: 0.1
Q1: 0.2
Median: 0.5
Q3: 0.8
95-th percentile: 1
Maximum: 1
Range: 0.9
Interquartile range (IQR): 0.6

Descriptive statistics

Standard deviation: 0.2884155496
Coefficient of variation (CV): 0.5748192319
Kurtosis: -1.274348884
Mean: 0.50175
Median Absolute Deviation (MAD): 0.3
Skewness: 0.08908200979
Sum: 1003.5
Variance: 0.08318352926
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value, Count, Frequency (%)
0.1, 320, 16.0%
0.2, 213, 10.7%
0.8, 208, 10.4%
0.5, 205, 10.2%
0.7, 200, 10.0%
0.3, 199, 10.0%
0.9, 195, 9.8%
0.6, 186, 9.3%
0.4, 168, 8.4%
1, 106, 5.3%

Value, Count, Frequency (%)
0.1, 320, 16.0%
0.2, 213, 10.7%
0.3, 199, 10.0%
0.4, 168, 8.4%
0.5, 205, 10.2%
0.6, 186, 9.3%
0.7, 200, 10.0%
0.8, 208, 10.4%
0.9, 195, 9.8%
1, 106, 5.3%

Value, Count, Frequency (%)
1, 106, 5.3%
0.9, 195, 9.8%
0.8, 208, 10.4%
0.7, 200, 10.0%
0.6, 186, 9.3%
0.5, 205, 10.2%
0.4, 168, 8.4%
0.3, 199, 10.0%
0.2, 213, 10.7%
0.1, 320, 16.0%

mobile_wt
Real number (ℝ≥0)

Distinct: 121
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 140.249
Minimum: 80
Maximum: 200
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 80
5-th percentile: 86
Q1: 109
Median: 141
Q3: 170
95-th percentile: 196
Maximum: 200
Range: 120
Interquartile range (IQR): 61

Descriptive statistics

Standard deviation: 35.3996549
Coefficient of variation (CV): 0.2524057562
Kurtosis: -1.210376474
Mean: 140.249
Median Absolute Deviation (MAD): 31
Skewness: 0.006558157429
Sum: 280498
Variance: 1253.135567
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value, Count, Frequency (%)
182, 28, 1.4%
185, 27, 1.4%
101, 27, 1.4%
146, 26, 1.3%
199, 26, 1.3%
88, 25, 1.2%
105, 25, 1.2%
198, 25, 1.2%
89, 24, 1.2%
145, 23, 1.1%
Other values (111), 1744, 87.2%

Value, Count, Frequency (%)
80, 21, 1.1%
81, 13, 0.7%
82, 15, 0.8%
83, 19, 0.9%
84, 17, 0.9%
85, 13, 0.7%
86, 19, 0.9%
87, 15, 0.8%
88, 25, 1.2%
89, 24, 1.2%

Value, Count, Frequency (%)
200, 19, 0.9%
199, 26, 1.3%
198, 25, 1.2%
197, 19, 0.9%
196, 20, 1.0%
195, 11, 0.5%
194, 16, 0.8%
193, 15, 0.8%
192, 15, 0.8%
191, 15, 0.8%

n_cores
Real number (ℝ≥0)

Distinct: 8
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.5205
Minimum: 1
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 4
Q3: 7
95-th percentile: 8
Maximum: 8
Range: 7
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.287836718
Coefficient of variation (CV): 0.5061025811
Kurtosis: -1.229749767
Mean: 4.5205
Median Absolute Deviation (MAD): 2
Skewness: 0.003627508314
Sum: 9041
Variance: 5.234196848
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Value, Count, Frequency (%)
4, 274, 13.7%
7, 259, 13.0%
8, 256, 12.8%
2, 247, 12.3%
5, 246, 12.3%
3, 246, 12.3%
1, 242, 12.1%
6, 230, 11.5%

Value, Count, Frequency (%)
1, 242, 12.1%
2, 247, 12.3%
3, 246, 12.3%
4, 274, 13.7%
5, 246, 12.3%
6, 230, 11.5%
7, 259, 13.0%
8, 256, 12.8%

Value, Count, Frequency (%)
8, 256, 12.8%
7, 259, 13.0%
6, 230, 11.5%
5, 246, 12.3%
4, 274, 13.7%
3, 246, 12.3%
2, 247, 12.3%
1, 242, 12.1%

pc
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 21
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.9165
Minimum: 0
Maximum: 20
Zeros: 101
Zeros (%): 5.1%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 5
Median: 10
Q3: 15
95-th percentile: 20
Maximum: 20
Range: 20
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 6.064314941
Coefficient of variation (CV): 0.6115378351
Kurtosis: -1.171498795
Mean: 9.9165
Median Absolute Deviation (MAD): 5
Skewness: 0.01730615047
Sum: 19833
Variance: 36.77591571
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=21)
Value, Count, Frequency (%)
10, 122, 6.1%
7, 119, 5.9%
9, 112, 5.6%
20, 110, 5.5%
14, 104, 5.2%
1, 104, 5.2%
0, 101, 5.1%
2, 99, 5.0%
17, 99, 5.0%
6, 95, 4.8%
Other values (11), 935, 46.8%

Value, Count, Frequency (%)
0, 101, 5.1%
1, 104, 5.2%
2, 99, 5.0%
3, 93, 4.7%
4, 95, 4.8%
5, 59, 2.9%
6, 95, 4.8%
7, 119, 5.9%
8, 89, 4.5%
9, 112, 5.6%

Value, Count, Frequency (%)
20, 110, 5.5%
19, 83, 4.2%
18, 82, 4.1%
17, 99, 5.0%
16, 88, 4.4%
15, 92, 4.6%
14, 104, 5.2%
13, 85, 4.2%
12, 90, 4.5%
11, 79, 4.0%

px_height
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1137
Distinct (%): 56.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 645.108
Minimum: 0
Maximum: 1960
Zeros: 2
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 70.95
Q1: 282.75
Median: 564
Q3: 947.25
95-th percentile: 1485.05
Maximum: 1960
Range: 1960
Interquartile range (IQR): 664.5

Descriptive statistics

Standard deviation: 443.7808108
Coefficient of variation (CV): 0.6879170787
Kurtosis: -0.3158654936
Mean: 645.108
Median Absolute Deviation (MAD): 318
Skewness: 0.6662712561
Sum: 1290216
Variance: 196941.408
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value, Count, Frequency (%)
347, 7, 0.4%
179, 6, 0.3%
371, 6, 0.3%
275, 6, 0.3%
526, 5, 0.2%
327, 5, 0.2%
674, 5, 0.2%
667, 5, 0.2%
356, 5, 0.2%
56, 5, 0.2%
Other values (1127), 1945, 97.2%

Value, Count, Frequency (%)
0, 2, 0.1%
1, 1, 0.1%
2, 1, 0.1%
3, 2, 0.1%
4, 3, 0.1%
5, 1, 0.1%
6, 1, 0.1%
7, 1, 0.1%
8, 2, 0.1%
9, 1, 0.1%

Value, Count, Frequency (%)
1960, 1, 0.1%
1949, 1, 0.1%
1920, 1, 0.1%
1914, 1, 0.1%
1901, 1, 0.1%
1899, 1, 0.1%
1895, 1, 0.1%
1878, 1, 0.1%
1874, 1, 0.1%
1869, 1, 0.1%

px_width
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1109
Distinct (%): 55.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1251.5155
Minimum: 500
Maximum: 1998
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 500
5-th percentile: 579.85
Q1: 874.75
Median: 1247
Q3: 1633
95-th percentile: 1929.05
Maximum: 1998
Range: 1498
Interquartile range (IQR): 758.25

Descriptive statistics

Standard deviation: 432.1994469
Coefficient of variation (CV): 0.3453408663
Kurtosis: -1.186005229
Mean: 1251.5155
Median Absolute Deviation (MAD): 376
Skewness: 0.01478747377
Sum: 2503031
Variance: 186796.3619
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value, Count, Frequency (%)
874, 7, 0.4%
1247, 7, 0.4%
1383, 6, 0.3%
1469, 6, 0.3%
1463, 6, 0.3%
1429, 5, 0.2%
1726, 5, 0.2%
1923, 5, 0.2%
1234, 5, 0.2%
1263, 5, 0.2%
Other values (1099), 1943, 97.2%

Value, Count, Frequency (%)
500, 2, 0.1%
501, 2, 0.1%
503, 1, 0.1%
506, 1, 0.1%
507, 4, 0.2%
508, 1, 0.1%
509, 2, 0.1%
510, 3, 0.1%
511, 2, 0.1%
512, 2, 0.1%

Value, Count, Frequency (%)
1998, 1, 0.1%
1997, 1, 0.1%
1996, 1, 0.1%
1995, 3, 0.1%
1994, 2, 0.1%
1992, 1, 0.1%
1991, 1, 0.1%
1990, 1, 0.1%
1989, 3, 0.1%
1988, 5, 0.2%

ram
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1562
Distinct (%): 78.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2124.213
Minimum: 256
Maximum: 3998
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 256
5-th percentile: 445
Q1: 1207.5
Median: 2146.5
Q3: 3064.5
95-th percentile: 3826.35
Maximum: 3998
Range: 3742
Interquartile range (IQR): 1857

Descriptive statistics

Standard deviation: 1084.732044
Coefficient of variation (CV): 0.5106512594
Kurtosis: -1.19191307
Mean: 2124.213
Median Absolute Deviation (MAD): 932.5
Skewness: 0.006628035399
Sum: 4248426
Variance: 1176643.606
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value, Count, Frequency (%)
2610, 4, 0.2%
2227, 4, 0.2%
3142, 4, 0.2%
1464, 4, 0.2%
1229, 4, 0.2%
315, 3, 0.1%
1958, 3, 0.1%
1277, 3, 0.1%
1724, 3, 0.1%
3703, 3, 0.1%
Other values (1552), 1965, 98.2%

Value, Count, Frequency (%)
256, 1, 0.1%
258, 2, 0.1%
259, 1, 0.1%
262, 1, 0.1%
263, 1, 0.1%
265, 1, 0.1%
267, 1, 0.1%
273, 1, 0.1%
277, 1, 0.1%
278, 2, 0.1%

Value, Count, Frequency (%)
3998, 1, 0.1%
3996, 1, 0.1%
3993, 1, 0.1%
3991, 2, 0.1%
3990, 1, 0.1%
3984, 1, 0.1%
3978, 1, 0.1%
3971, 1, 0.1%
3970, 2, 0.1%
3969, 1, 0.1%

sc_h
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 15
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.3065
Minimum: 5
Maximum: 19
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 5
5-th percentile: 6
Q1: 9
Median: 12
Q3: 16
95-th percentile: 19
Maximum: 19
Range: 14
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 4.213245004
Coefficient of variation (CV): 0.3423593227
Kurtosis: -1.190791247
Mean: 12.3065
Median Absolute Deviation (MAD): 4
Skewness: -0.09888424098
Sum: 24613
Variance: 17.75143347
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)
Value, Count, Frequency (%)
17, 193, 9.7%
12, 157, 7.8%
7, 151, 7.5%
16, 143, 7.1%
14, 143, 7.1%
15, 135, 6.8%
13, 131, 6.6%
11, 126, 6.3%
10, 125, 6.2%
19, 124, 6.2%
Other values (5), 572, 28.6%

Value, Count, Frequency (%)
5, 97, 4.9%
6, 114, 5.7%
7, 151, 7.5%
8, 117, 5.9%
9, 124, 6.2%
10, 125, 6.2%
11, 126, 6.3%
12, 157, 7.8%
13, 131, 6.6%
14, 143, 7.1%

Value, Count, Frequency (%)
19, 124, 6.2%
18, 120, 6.0%
17, 193, 9.7%
16, 143, 7.1%
15, 135, 6.8%
14, 143, 7.1%
13, 131, 6.6%
12, 157, 7.8%
11, 126, 6.3%
10, 125, 6.2%

sc_w
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 19
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.767
Minimum: 0
Maximum: 18
Zeros: 180
Zeros (%): 9.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 5
Q3: 9
95-th percentile: 14
Maximum: 18
Range: 18
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 4.356397606
Coefficient of variation (CV): 0.7554010067
Kurtosis: -0.3895227894
Mean: 5.767
Median Absolute Deviation (MAD): 3
Skewness: 0.6337870734
Sum: 11534
Variance: 18.9782001
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=19)
Value, Count, Frequency (%)
1, 210, 10.5%
3, 199, 10.0%
4, 182, 9.1%
0, 180, 9.0%
5, 161, 8.1%
2, 156, 7.8%
7, 132, 6.6%
6, 130, 6.5%
8, 125, 6.2%
10, 107, 5.3%
Other values (9), 418, 20.9%

Value, Count, Frequency (%)
0, 180, 9.0%
1, 210, 10.5%
2, 156, 7.8%
3, 199, 10.0%
4, 182, 9.1%
5, 161, 8.1%
6, 130, 6.5%
7, 132, 6.6%
8, 125, 6.2%
9, 97, 4.9%

Value, Count, Frequency (%)
18, 8, 0.4%
17, 19, 0.9%
16, 29, 1.5%
15, 31, 1.6%
14, 33, 1.7%
13, 49, 2.5%
12, 68, 3.4%
11, 84, 4.2%
10, 107, 5.3%
9, 97, 4.9%

talk_time
Real number (ℝ≥0)

Distinct: 19
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.011
Minimum: 2
Maximum: 20
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 2
5-th percentile: 3
Q1: 6
Median: 11
Q3: 16
95-th percentile: 20
Maximum: 20
Range: 18
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 5.463955198
Coefficient of variation (CV): 0.4962269728
Kurtosis: -1.218590963
Mean: 11.011
Median Absolute Deviation (MAD): 5
Skewness: 0.009511762222
Sum: 22022
Variance: 29.8548064
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=19)
Value, Count, Frequency (%)
7, 124, 6.2%
4, 123, 6.2%
16, 116, 5.8%
15, 115, 5.8%
19, 113, 5.7%
6, 111, 5.5%
10, 105, 5.2%
8, 104, 5.2%
11, 103, 5.1%
20, 102, 5.1%
Other values (9), 884, 44.2%

Value, Count, Frequency (%)
2, 99, 5.0%
3, 94, 4.7%
4, 123, 6.2%
5, 93, 4.7%
6, 111, 5.5%
7, 124, 6.2%
8, 104, 5.2%
9, 100, 5.0%
10, 105, 5.2%
11, 103, 5.1%

Value, Count, Frequency (%)
20, 102, 5.1%
19, 113, 5.7%
18, 100, 5.0%
17, 98, 4.9%
16, 116, 5.8%
15, 115, 5.8%
14, 101, 5.1%
13, 100, 5.0%
12, 99, 5.0%
11, 103, 5.1%

three_g
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value, Count, Frequency (%)
1, 1523, 76.1%
0, 477, 23.8%

Length

Histogram of lengths of the category

Pie chart

Value, Count, Frequency (%)
1, 1523, 76.1%
0, 477, 23.8%

Most occurring characters

Value, Count, Frequency (%)
1, 1523, 76.1%
0, 477, 23.8%

Most occurring categories

Value, Count, Frequency (%)
Decimal Number, 2000, 100.0%

Most frequent character per category

Decimal Number
Value, Count, Frequency (%)
1, 1523, 76.1%
0, 477, 23.8%

Most occurring scripts

Value, Count, Frequency (%)
Common, 2000, 100.0%

Most frequent character per script

Common
Value, Count, Frequency (%)
1, 1523, 76.1%
0, 477, 23.8%

Most occurring blocks

Value, Count, Frequency (%)
ASCII, 2000, 100.0%

Most frequent character per block

ASCII
Value, Count, Frequency (%)
1, 1523, 76.1%
0, 477, 23.8%

touch_screen
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 0
5th row: 1

Common Values

Value, Count, Frequency (%)
1, 1006, 50.3%
0, 994, 49.7%

Length

Histogram of lengths of the category

Pie chart

Value, Count, Frequency (%)
1, 1006, 50.3%
0, 994, 49.7%

Most occurring characters

Value, Count, Frequency (%)
1, 1006, 50.3%
0, 994, 49.7%

Most occurring categories

Value, Count, Frequency (%)
Decimal Number, 2000, 100.0%

Most frequent character per category

Decimal Number
Value, Count, Frequency (%)
1, 1006, 50.3%
0, 994, 49.7%

Most occurring scripts

Value, Count, Frequency (%)
Common, 2000, 100.0%

Most frequent character per script

Common
Value, Count, Frequency (%)
1, 1006, 50.3%
0, 994, 49.7%

Most occurring blocks

Value, Count, Frequency (%)
ASCII, 2000, 100.0%

Most frequent character per block

ASCII
Value, Count, Frequency (%)
1, 1006, 50.3%
0, 994, 49.7%

wifi
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value, Count, Frequency (%)
1, 1014, 50.7%
0, 986, 49.3%

Length

Histogram of lengths of the category

Pie chart

Value, Count, Frequency (%)
1, 1014, 50.7%
0, 986, 49.3%

Most occurring characters

Value, Count, Frequency (%)
1, 1014, 50.7%
0, 986, 49.3%

Most occurring categories

Value, Count, Frequency (%)
Decimal Number, 2000, 100.0%

Most frequent character per category

Decimal Number
Value, Count, Frequency (%)
1, 1014, 50.7%
0, 986, 49.3%

Most occurring scripts

Value, Count, Frequency (%)
Common, 2000, 100.0%

Most frequent character per script

Common
Value, Count, Frequency (%)
1, 1014, 50.7%
0, 986, 49.3%

Most occurring blocks

Value, Count, Frequency (%)
ASCII, 2000, 100.0%

Most frequent character per block

ASCII
Value, Count, Frequency (%)
1, 1014, 50.7%
0, 986, 49.3%

price_range
Categorical

HIGH CORRELATION
UNIFORM

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2000
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 2
3rd row: 2
4th row: 2
5th row: 1

Common Values

Value, Count, Frequency (%)
1, 500, 25.0%
0, 500, 25.0%
2, 500, 25.0%
3, 500, 25.0%

Length

Histogram of lengths of the category

Pie chart

Value, Count, Frequency (%)
3, 500, 25.0%
2, 500, 25.0%
0, 500, 25.0%
1, 500, 25.0%

Most occurring characters

Value, Count, Frequency (%)
1, 500, 25.0%
2, 500, 25.0%
3, 500, 25.0%
0, 500, 25.0%

Most occurring categories

Value, Count, Frequency (%)
Decimal Number, 2000, 100.0%

Most frequent character per category

Decimal Number
Value, Count, Frequency (%)
1, 500, 25.0%
2, 500, 25.0%
3, 500, 25.0%
0, 500, 25.0%

Most occurring scripts

Value, Count, Frequency (%)
Common, 2000, 100.0%

Most frequent character per script

Common
Value, Count, Frequency (%)
1, 500, 25.0%
2, 500, 25.0%
3, 500, 25.0%
0, 500, 25.0%

Most occurring blocks

Value, Count, Frequency (%)
ASCII, 2000, 100.0%

Most frequent character per block

ASCII
Value, Count, Frequency (%)
1, 500, 25.0%
2, 500, 25.0%
3, 500, 25.0%
0, 500, 25.0%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
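
As a minimal sketch of that calculation, the function below divides the covariance by the product of the standard deviations using only the standard library. The function name and the sample data are illustrative assumptions, not part of the report or of pandas-profiling.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

# Made-up, nearly linear data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]
r = pearson_r(xs, ys)

# Shifting and rescaling one variable leaves r unchanged.
r_shifted = pearson_r([10 * a + 3 for a in xs], ys)
print(r, r_shifted)
```

The second call illustrates the invariance under changes in location and scale mentioned above.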

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
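
A minimal sketch of this procedure: rank both variables, then apply the Pearson formula to the ranks. Assigning tied values their average rank is an assumption made here for illustration; the helper names are hypothetical.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A monotonic but nonlinear relationship (y = x cubed).
xs = [1, 2, 3, 4, 5]
ys = [1, 8, 27, 64, 125]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

On this made-up data the rank-based coefficient reaches its maximum while Pearson's r stays below it, which is the behaviour described above.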

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
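
The pair-counting definition can be sketched directly. The version below is the simplest form (often called tau-a): tied pairs count as neither concordant nor discordant. Whether the report itself uses a tie-adjusted variant is not stated in the text, so treat this as an illustration of the definition only.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

With three observations there are three pairs; in the call above two are concordant and one is discordant.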

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available for Phik.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
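
A minimal sketch of a bias-corrected Cramér's V along the lines of Bergsma's correction, computed from a contingency table of counts. The function name and the example tables are illustrative assumptions, and the sketch assumes every row and column total is non-zero; it is not taken from the pandas-profiling source.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    r = len(table)
    k = len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Corrected phi-squared and corrected table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

perfect = cramers_v_corrected([[50, 0], [0, 50]])
independent = cramers_v_corrected([[25, 25], [25, 25]])
print(perfect, independent)
```

The two made-up tables cover the extremes described above: perfect association and independence.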

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

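, battery_power, blue, clock_speed, dual_sim, fc, four_g, int_memory, m_dep, mobile_wt, n_cores, pc, px_height, px_width, ram, sc_h, sc_w, talk_time, three_g, touch_screen, wifi, price_range
0, 842, 0, 2.2, 0, 1, 0, 7, 0.6, 188, 2, 2, 20, 756, 2549, 9, 7, 19, 0, 0, 1, 1
1, 1021, 1, 0.5, 1, 0, 1, 53, 0.7, 136, 3, 6, 905, 1988, 2631, 17, 3, 7, 1, 1, 0, 2
2, 563, 1, 0.5, 1, 2, 1, 41, 0.9, 145, 5, 6, 1263, 1716, 2603, 11, 2, 9, 1, 1, 0, 2
3, 615, 1, 2.5, 0, 0, 0, 10, 0.8, 131, 6, 9, 1216, 1786, 2769, 16, 8, 11, 1, 0, 0, 2
4, 1821, 1, 1.2, 0, 13, 1, 44, 0.6, 141, 2, 14, 1208, 1212, 1411, 8, 2, 15, 1, 1, 0, 1
5, 1859, 0, 0.5, 1, 3, 0, 22, 0.7, 164, 1, 7, 1004, 1654, 1067, 17, 1, 10, 1, 0, 0, 1
6, 1821, 0, 1.7, 0, 4, 1, 10, 0.8, 139, 8, 10, 381, 1018, 3220, 13, 8, 18, 1, 0, 1, 3
7, 1954, 0, 0.5, 1, 0, 0, 24, 0.8, 187, 4, 0, 512, 1149, 700, 16, 3, 5, 1, 1, 1, 0
8, 1445, 1, 0.5, 0, 0, 0, 53, 0.7, 174, 7, 14, 386, 836, 1099, 17, 1, 20, 1, 0, 0, 0
9, 509, 1, 0.6, 1, 2, 1, 9, 0.1, 93, 5, 15, 1137, 1224, 513, 19, 10, 12, 1, 0, 0, 0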

Last rows

battery_powerblueclock_speeddual_simfcfour_gint_memorym_depmobile_wtn_corespcpx_heightpx_widthramsc_hsc_wtalk_timethree_gtouch_screenwifiprice_range
1990161712.4081360.8851974314262965371000
1991188202.00111440.811381947433579198201103
199267412.9110210.219834576180911806341110
1993146710.5000180.61225088810993962151151113
199485802.2010500.1841252814163978171631103
199579410.510120.810661412221890668134191100
1996196512.6100390.218743915196520321110161112
1997191100.9111360.710883868163230579151103
1998151200.9041460.1145553366708691810191110
199951012.0151450.9168616483754391919421113